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Abstract Multiply lensed sources experience a relative time 
delay in the arrival of photons. This effect can be used to 
measure absolute distances and the Hubble constant (Hp) 
and is known as time-delay cosmography. The methodol- 
ogy is independent of the local distance ladder and early- 
universe physics and provides a precise and competitive mea- 
surement of Ho. With upcoming observatories, time-delay 
cosmography can provide a 1% precision measurement of 
Ho and can decisively shed light on the current reported 
*Hubble tension’. This paper presents the theoretical back- 
ground and the current techniques applied for time-delay 
cosmographic studies and the measurement of the Hubble 
constant. The paper describes the challenges and systemat- 
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ics in the different components of the analysis and strategies 
to mitigate them. The current measurements are discussed in 
context and the opportunities with the anticipated data sets 
in the future are laid out. 
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1 Introduction 


Time-delay cosmography uses measurements of relative ar- 
rival times of multiply lensed sources to measure an abso- 
lute distance of the Universe. The method was originally 
proposed by over half a century ago, prior 
to the discovery of the first extragalactic gravitational lens. 
Time-delay cosmography provides a one-step measurement 
of the Hubble constant, completely independent of the lo- 
cal distance ladder or probes anchored with sound horizon 
physics, such as the cosmic microwave background (CMB). 

Almost a century after its first measurement, the Hubble 
constant Hp still remains arguably the most debated number 
in cosmology. In the past few years, a tension has emerged 
between a number of local measurements, and inferences 
from early-Universe probes such as the cosmic microwave 
background (CMB) and Big Bang Nucleosynthesis, under 
the assumption of flat A cold dark matter (ACDM) cosmol- 


ogy (see, €.g., 2019})Shah et al.|/2021| for recent 
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summaries). If this tension is real, and not due to unknown 
systematic uncertainties in multiple measurements and their 
analyses, it implies that the standard ACDM model is not 
sufficient and new physical ingredients beyond this model 
are required. From a theoretical standpoint, a number of 
possible solutions — for example, involving early dark en- 


ergy — have been proposed (e.g., |Knox and Millea| |2020 
Di Valentino et al.|/2021}|Schoneberg et al.||2021) and refer- 


ences therein), often requiring fine-tuning of free parameters 
not to violate other observational constraints. From an obser- 
vational standpoint, besides improving the precision of the 
measurements, significant attention has turned to the sys- 
tematic investigation of unknown systematic uncertainties 


(c.g. [Riess et al 2019] Freedman et al 2079] 2020} Riess 
Morisell et al} 2021p. 

This manuscript details the general methodology devel- 
oped over the past decades in time-delay cosmography, dis- 
cusses recent advances and results, and, foremost, provides a 
foundation and outlook for the next decade in providing ac- 
curate and ever more precise measurements with increased 
sample size and improved observational techniques. This 
manuscript will be submitted as a chapter of a Space Science 
Reviews, Topical Collection "Strong Gravitational Lensing", 


eds. J. Wambsganss et al. We will refer throughout this manuscript 


to other forthcoming chapters of the same collection cover- 
ing a wide range of strong lensing theory and applications 
(e.g., Saha et al. (in preparation), Shajib et al. (in prepara- 
tion), Suyu et al. (in preparation), Vernardos et al. (in prepra- 
tion), McMahon et al. (in preparation)). We refer to, e.g., 
(2016): @OI8) to provide 
more in-depth historical perspectives on the early years of 
the field and Treu, Suyu & Marshall, 2022, submitted, to a 
broader and less technical perspective on the opportunities 
of time-delay cosmography in this decade. 

We discuss the methodology around lensed quasars as 
source objects to perform time-delay measurements because 
quasars are currently the most established sources with the 
most relevant current results. We refer to[Oguri| (2019) and 
Suyu et al. (in preparation) using, for example, lensed su- 
pernovae or other transient phenomena that can be utilized 
to perform time-delay cosmographic measurements. 

This manuscript is organized as follows: In Section[2|we 
provide the general concept and physics to turn relative time 
delays between multiple images of the same source into dis- 
tance measurements. Section|3}provides an overview of the 
required analysis ingredients for individual lenses. Subse- 
quent sections go into the details of these ingredients, the 
time-delay measurement (Section 4). determination of the 
lensing potential (Section [5). and the study of the line of 
sight (LOS) of the lenses (Section [6). Section [7] describes 
the cosmographic inferences and how to utilize a sample of 
lenses to perform an Ho inference. We summarize the cur- 
rent status and results obtained in Section | Lastly, in Sec- 


tion [10] we look in the future and discuss the potential and 
challenges lying ahead for the community. 


2 Time delays and the time-delay distance 


The deflection of light due to mass over- or under-density in 
the Universe can be captured in the lens equation (see Saha 
et al. (in preparation) for the detailed theory, including that 
this formula is only valid for small angles) 


B= 0-6), (1) 


where £ is the angular position of the source without the 
lensing effect, 6 is the corresponding angular coordinate on 
the sky as seen when lensed, and a@ is the deflection angle 
that maps the image position to the source position in an- 
gular coordinates as seen from the observer. There exists a 
scalar potential, the lensing potential ¢, such that the gradi- 
ents correspond to the deflection field 


Vo(0) = a(6). (2) 


The convergence of the potential ¢, x, is half the Lapla- 
cian 


1 
KO) = V°G(6) (3) 


and is given in the thin lens approximation for small angles 
as 


2(0) 
Jcrit 


kK(9) = (4) 
with X(@) as the projected mass over- or under-density with 
respect to the mean background density and 2,,; the critical 
surface density|"| 


Dy 


Serit = ————, 
me 4nGDasDa 


(5) 
where c is the speed of light and G the gravitational con- 
stant. Dg is the angular diameter distance to the lens, D, is 
the angular diameter distance to the source, and Da, is the 
angular diameter distance between the lens and the source, 
respectively. 

When the intensity of a strongly lensed background source 
varies over time, such as an active galactic nucleus (AGN), 
the variability pattern manifests in each of the multiple im- 
ages and is delayed in time due to the different light paths of 
the different images (see Figure [ip. The relative arrival time 
between two images 0, and 6g, Atag, originated from the 
same source # is 


tgs =“ [1(09.B) - 105,)]. (6) 


' not to be confused with the critical density of the universe 
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mildly dependent on the relative expansion history from cur- 
rent time (z = Q) to the redshift of the deflector and the 
source (dependance on E(z) in Eqn. [12}. Time-delay cos- 
mography is primarily an absolute distance anchor and at the 


is the Fermat potential (Schneider]/1985} {Blandford and NarayanPeart of the current tension in cosmology. The mild depen- 


where 
6- 2, 
76,8) =| °F” — 0) @) 
(1986), and 
DgD,s 
Da, = (1 + za) a (8) 


is the time-delay distance (Refsdal| |1964} |Schneider et al. 
1992 2010). The Fermat potential (Eqn.[7) con- 


sists of two terms, a geometric term reflecting the geometric 
path difference, and a potential term, capturing the differ- 
ence in the local spacetime dilation, known as the Shapiro 
delay. The optical terms stated (such as the Fermat potential) 
are only valid under small-angle and thin-lens assumptions. 
We refer to Saha et al. (in preparation) for these assump- 
tions and the more general multi-plane formalism. 

The angular diameter distance between two redshifts z, 
and z in an Friedmann—Lemaitre—Robertson—Walker (FLRW) 
metric is 


1 
D(z, 22) = Ta fk E122) (9) 
+29 
where 
Cc ro) dz’ 
x(aita) = [ nn (10) 


is the comoving distance with E(z) = H(z)/Hp as the dimen- 
sionless Friedman equation and 


K-12 sin (K'/2y) for K > 0 
Sk) =3X for K =0 
(-K)~!/? sinh [Kx] for K <0 


(11) 


is the spatial curvature of the background metric. 

In the ACDM cosmology with density parameters Q, 
for matter, Q; for spatial curvature, and Q, for dark energy 
described by the cosmological constant A, the dimensionless 
Friedman equation (F(z), Eqn.[I0) is given by 

1/2 
E(Z) = (Qu(1 + 2)? + QC + 2)? + Qa) (12) 
and the spatial curvature is K = -Q,H5 ce 

Constraints on the Fermat potential difference ATap and 
a measured time delay Atap allows one to constrain the time- 
delay distance Dy,. This absolute physical distance anchors 
the scale in the Universe within the redshifts involved in the 
lensing configuration. The Hubble constant is inversely pro- 
portional to the absolute scales of the Universe (see e.g., 
Eqn.[10) and thus scales with Dj, as 


-1 
Ho o Diy» 


(13) 


dency on other cosmological parameters beyond Hp can be 
compensated with other cosmological probes that are sensi- 
tive to the relative expansion history (such as SNIa luminos- 
ity distances, eg. 
(2019), or with a large set of gravitational lenses at different 
lens and source redshifts. 

While the time delay At,z can be directly measured (see 
Section (4). the relative Fermat potential 4T,g is not a di- 
rect observable. The primary information to infer ATap are 
positional constraints and extended distortions in the lensed 
arcs from the lensing effect. However, there are degenera- 
cies inherent in gravitational lensing that limit the amount 
of information accessible by lensing distortions (e.g., 
1988 
2006) 
2016} (2017) [Birrer]|2021) and Saha et 
al. (in preparation). 

The most prominent lensing degeneracy impacting the 
time-delay prediction is the mass-sheet degeneracy (MSD; 
Falco et al.||/1985). The MSD is a multiplicative transform 
of the lens equation (Eq.[I) which preserves image positions 
(and any higher order relative differentials of the lens equa- 
tion) under a linear source displacement B — AB combined 
with a transformation of the convergence field 


K,(8) = AK(O) + (1 — A). 


aay 


(14) 


Eq. above is a mathematical transformation where the 
term (1—A) = Kmst iS equivalent to an infinite sheet of conver- 
gence mass, and hence the name mass sheet transform. Kmst 
can be positive and negative, since it is defined relative to 
another density profile and also relative to the average den- 
sity in the universe. Only observables related to the unlensed 
apparent source size, the unlensed apparent brightness, or 
the lensing potential are able to break this degeneracy. Thus, 
the same relative lensing observables can be predicted if the 
mass profile is scaled by the factor A with the addition of a 
sheet of convergence (or mass) of Kmst(@) = (1 — A). 

The difference of Fermat potentials between a pair of 
images A& B (Eq.[7) scales with 2 as 


Atap.a = AATaAB, (15) 
and so does the time delay as 
Atapa = AAtap. (16) 


When transforming a lens model with an MST, the inference 
of the time-delay distance (Eqn. |8) from a measured time 
delay and inferred Fermat potential transforms as 


Dig ea Dx. (17) 
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Fig. 1 Illustration of the light path of the quadruply imaged lensed quasar HE0435-1223. The different light paths result in different arrival times. 
The relative time delays between the images is directly proportional to the overall physical distances from the observer to the lens and source. 
Measuring the time delays and reconstructing the lensing effect allow one to measure an absolute scale in the universe. Graphics from: Martin 


Millon, Image from Hubble Space Telescope 2017). 


Thus, the Hubble constant, when inferred from the time- 
delay distance D4,, transforms as (from Eqn.[13) 
Ao, = AH. (18) 

An MSD effect relative to a proposed deflector model 
might occur either within the mass distribution of the main 
deflector, referred as internal MSD with Ain, or being caused 
by inhomogenities along the line of sight (LOS) of the strong 
lens system. 

Mass over- or under-densities along the LOS of the strong 
lensing system cause, to first order, shear and convergence 
lensing perturbations. Reduced shear distortions do have a 
measurable imprint on the azimuthal structure of the strong 
lensing arcs (see e.g., [Birrer| while the convergence 
component of the LOS, denoted as kext, is equivalent to an 
MST, and thus not directly measurable from imaging data. 

We define D*"’ as the specific angular diameter distance 
along the specific line of sight of the lens being impacted 
by LOS structure and D>‘? as the angular diameter distance 
from the homogeneous background metric without any per- 
turbative contributions. D!°"’ and D'S are related through 
the convergence terms as 


Dies = (1 — xq)D* 
bk, 

pe = (1 _ Ks)Dy g 

Dig = (1 — kas)D 9° 

(19) 


where x, is the external convergence from the observer to the 
deflector, x, from the observer to the source, and xg, from the 
deflector to the source, respectively (Birrer et al.|/2020). 
The lensing kernel impacting the time delay can be de- 
scribed as the product of three different angular diameter 


distances entering D4, in Equation 2020 
Fleury et al.|/2021a), 


Cl = ka)Q = ks) 


1 kas ve 


1 — Kext = 


We note that, also directly visible from the equation above, 
the lensing efficiency (see Saha et al. (in preparation)) im- 
pacting the linear distortions for both shear and Kex is differ- 
ent from the standard weak lensing efficiency in the absence 


of a strong lensing deflector (McCully et al. 2014} |2017 
2017||2020} {Fleury et al.|{2021b). 


MSD uncertainties or biases may also arise relative to as- 
sumptions made in the radial density profile of the main de- 


flector galaxy (see e.g. [Kochanek}/2002} Saha and Williams| 
2006) 2007 2013} [Coles 
etal 2014)(Xu et al] 2016 2016}|Unruh etal 
017}|Sonnenfeld}/2018 2020}|Blum et al.}|/2020 

The main aspect of this internal mass profile degeneracy 
component can be approximated to first order with an inter- 
nal MSD and its parameter 2; relative to an assumed mass 
profile. 

The total MST, the relevant transform to constrain for 
an accurate Fermat potential determination and Hp measure- 
ment, is the product of the internal and external MST (e.g., 


Be 
- 
o 


N 
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Schneider and Sluse}|2013 2016} |2020)} 


A= qd _ Kext) x Aint. (21) 


The MST and its generalizations imply that one has to 
rely either on non-lensing information or assumptions about 
its functional form to constrain the radial mass density pro- 
file with a sufficient level of precision for time-delay cos- 
mography. 

The external line-of-sight lensing contribution can be es- 
timated by tracers of the large-scale structure, either using 


galaxy number counts (e.g.,{Greene et al.|/2013 
2017), or weak-lensing measurements (Tihhonova et al.}|2018 


(2020). These measurements, paired with a cosmological model 
including a galaxy—halo connection are able to constrain the 
probability distribution of kex, to a few per cent (in terms of 
impact on Dy4;) per sight line. 

Among those observations that are sensitive to the total 
MSD a, stellar kinematics is the most prominent and com- 
monly used one. The dynamics of stars is a direct tracer of 
the three-dimensional gravitational potential and provides 
an independent mass estimate. Joint lensing+dynamics con- 
straints have been used to provide measurements of galaxy 


mass profiles (eg. (Grogin and Narayan||1996)[Romanowsky 
(2012). The modelling of the 
kinematic observables in lensing galaxies range in complex- 
ity from spherical Jeans modeling 


2008) to Schwarzschild (Schwarzschild}|1979) methods. 
Regardless of the approach, the prediction of the LOS 


velocity dispersion o-, from any model can be decomposed 
into a cosmology-dependent and cosmology-independent part, 


as (see e.g.,|Birrer et al.|[2016] {2019) 


1-k, 
2 C’I(Ejenss Banis A Aint), 


oO 


~ 4] — Kds = ea 
where c is the speed of light, J is a dimensionless quantity 
dependent on the deflector model parameters (&jens), the stel- 
lar anisotropy distribution (B,ni) and the observational con- 
ditions and luminosity-weighting within the aperture (e.g., 
(2010). The internal component Ain, should be physi- 
cally interpretable as a three-dimensional mass profile and 
incorporated into the kinematics modeling term J, in partic- 
ular when there are multiple aperture measurements avail- 


able (‘Teodor et al.}/2022). In the approximate case of a very 


extended sheet-like perturbation, we can approximate 


J (Etens> Bani» Aint) od AintI Etens> Bani): 


Joint lensing+dynamics constraints are either able to con- 
strain the MSD component of the deflector model, or pro- 
vide additional cosmographic constraints on the relative ex- 
pansion history through the involved angular diameter dis- 
tance ratio. When adding a time delay, the joint cosmographic 


(23) 


constraints from a time-delay+lensing+dynamics analysis 
can be translated into a two-dimensional angular diameter 
distance plane (Birrer et al.|/2016][2019). When mapped into 
the D4,—Dg plane, the projection on constraints in a is in- 
variant i any pure external MSD parameter Kext ( 


9) Jae al 2015 Bier eta 2079] Ved 
ecal}pozify 


3 Overview on analysis ingredients 


To measure the distances D4;, or more general the D4,—Da 
combination, from a time-delay lens system for cosmogra- 
phy, we need the following data products: 


discovery of a lens with a time-variable source, 
spectroscopic redshifts of the lens, zg, and the source, Zs, 
time delays between the multiple images, 

lens mass model to determine the Fermat potential, 

lens environment studies to constrain external lensing ef- 
fects related to the mass-sheet degeneracy. 


Ci 


The dataset required for each step are observationally 
cheap in comparison to other cosmological probes. How- 
ever, the combined analysis, even of a single lens, requires 
the coordination of multiple independent observations. The 
analysis can be impossible or severely limited in its preci- 
sion and reliability by a single missing ingredient. For ex- 
ample, without measurements of a time delay, no constraints 
on absolute distances involved can be inferred, and thus, re- 
gardless of the approach chosen, no direct constraints on the 
Hubble constant can be achieved. 


The spectroscopic redshifts of the quasar sources, z;, are 
often easy to obtain given the frequent emission lines in 
quasars. The redshift of the lens, zg, can be challenging to 
measure since the bright quasar images can outshine the 
lens galaxy. Getting zq of lensed quasar systems often re- 
quire spectra taken under good seeing condition, to deblend 
the lensing galaxy from the quasar. Technically, the redshifts 
involved in the lensing system are not directly required for 
the distance measurement. However, for the cosmological 
interpretation of the obtained distances, the redshifts are of 
crucial importance. 


We describe in the next sections the remaining three in- 
gredients; time delays (Section |4p, lensing potential (galaxy 
scale and cluster) (Section|5), and line-of-sight perturbations 
(Section |6). 


2 Dg is still dependent on the LOS between observer and lens, i.e. 


on Ka (Eqn. [T9}. 
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4 Measuring time delays 
4.1 Monitoring of lensed quasars 


Lensed quasars were quickly identified as excellent sources 
for time-delay cosmography as they are variable on short 
timescale, making the time-delay measurements possible, 
and they are sufficiently bright to be observed at cosmologi- 
cal distances. They are also much more common than lensed 
supernovae as around 300 lensed quasars have been discov- 
ered at the time of writing compared to only four lensed su- 
pernovae (see Suyu et al. (in preparation)). Lensed quasars 
are typically found at redshift z, ~ 1-3, lensed by massive 
early-type galaxies located around redshift zq ~ 0.2 — 0.8 
(e.g., and McMahon et al. (in prepa- 
ration)). This configuration typically produces multiple im- 
ages separated by a few arcseconds, which is sufficient to be 
resolved with small ground-based telescope. The monitoring 
of lensed quasars is thus challenging but possible with 1-m 
or 2-m class telescope, provided that a regular and long-term 
access is guaranteed (see e.g. the COSMOGRAIL collabo- 


ration |Eigenbrod et al.| {2005 2011 
2013a). It is however limited by several astrophysical, 


observational and instrumental factors, that are listed below. 


Photometric accuracy : In the optical, most quasars are vari- 
able on a timescale of weeks to years, and the longest vari- 
ations also have the largest amplitude. This means that ei- 
ther long-duration light curves or high photometric accu- 
racy are required to measure the delay reliably. In one vis- 
ibility season, variations of the order of 0.2 mag are often 
observed, which requires a photometric accuracy of a few 
milli-magnitudes to identify precisely the inflexion points. 
These inflections points are essential features in the light 
curves since it is not possible to measure a time delay if 
the quasars does not display any variations, or if the first 
derivative remains always constant. Reaching a photomet- 
ric accuracy of only a few milli-magnitude is challenging as 
the quasar images are often blended with extended sources 
such as gravitational arcs or the lens galaxy. Consequently, 
the reconstruction of the Point Spread Function (PSF) and 
proper treatment of the contaminating light from these ex- 
tended sources are usually the key to reduce the noise in the 
light curves. 


Monitoring cadence and duration of the monitoring: A good 
sampling of the light curves is necessary if one targets the 
fast variations of small amplitudes. The monitoring cadence 
then needs to be commensurate with the timescale of the tar- 
geted variations. The duration of the monitoring campaign 
also needs to be sufficient to cover the duration of the time 
delays and to ensure that enough variations of the quasar are 
recorded. This requires continuous access to the telescope 


for at least one visibility season, which is typically 6 to 8 
months. 


Windowing effects and correlated noise: Seasonal gaps are 
often unavoidable in optical light curves since only circum- 
polar targets are observable all year-long. The fact that data 
are missing every year can introduce some windowing ef- 
fects, which should be accounted for when using cross-cor- 
relation techniques to measure time delays. The missing data 
introduces a periodic signal that must be carefully removed 
or taken into account before attempting to measure the time 
delays. Additionally, great care should be taken in the pres- 
ence of correlated noise, which is often present in the light 
curves due to cross-talk between the quasar images. If no 
evident variations can be matched unambiguously in both 
light curves, it is unlikely that any statistical methods will 
robustly measure a time delay. 


Extrinsic variability: Extrinsic variations are often observed 
in the light curves. They are caused mainly by the microlens- 
ing of the quasar images, and also a variety of other as- 
trophysical effects (see e.g. |Schechter et al.| {2003 


burne and Kochanek! |2010)} |Dexter and Agol]} |201 1} 
2014| and Vernardos et al. (in prepration)). Mi- 


crolensing is caused by the stars in the lensing galaxy, which 
add some extra time-variable "micro-magnification" on top 
of the static "macro-magnification" produced by the lensing 
galaxy. As described in Vernardos et al. (in prepration), the 
modulation of the micro-magnification due to the relative 
motion between the quasar, the lens and the observer intro- 
duces some extrinsic variations on top of the quasar intrinsic 
variations. For this reason, the light curves, even shifted in 
time and magnitude, rarely match perfectly. These extrinsic 
variations can severely bias time-delay measurements if not 
appropriately modelled and marginalized over. 


In the past two decades, these difficulties have been pro- 
gressively dealt with. The advances in photometric instru- 
mentation in the late 1990s allowed us to acquire accurate 
and well-sampled light curves, which yielded the first robust 
time-delay measurements from optical monitoring (e. g..|Kundié 
and in radio monitoring 
2000). Although some 


of these measurements already reached an excellent preci- 
sion of a few percents, the majority had ~10-15% errors, 
hence limiting the measurement of Ho to the same preci- 
sion. These first encouraging results led to a systematic at- 
tempt to monitor a sample of lensed quasars in both hemi- 
spheres by the COSMOGRAIL program, which started in 


2003 (Courbin et al.| |2005). The observing strategy then 
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Fig. 2 R-band light curves of the lensed quasar HE0435-1223, obtained by the COSMOGRAIL program from the Euler 1.2m Swiss Telescope, 
the 1.5m telescope at the Maidanak Observatory, the Mercator 1.2m telescope, and the SMARTS 1.3m telescope. The bottom panels corresponds 
to the difference between pairs of light curves, corrected by the time delays, highlighting the microlensing variability. Figure reproduced from 


(Millon et al.||2020a). 


was to follow a dozen of lenses at bi-weekly cadence un- 
til the time delays can be measured to a few percent pre- 
cision. This strategy yielded precise measurements for the 
brightest and most variable objects in about five years 


soz et al. |2007) |2008 2011 
2013b}|Eulaers et al. /2013}|Rathna Kumar et al.|/2013) but 


required more than a decade of monitoring to obtain the 
time delays for most of the less variable and fainter tar- 
gets [2020a). An example of a light curves 
acquired by the COSMOGRAIL program over the past 15 
years is shown in Figure[2| Thanks to this long-term observ- 


ing effort and other monitoring campaigns (e.g. 
ter et al.|{2007| |Goicoechea and Shalyapin] {2016} 
et al.) |2017; |Shalyapin and Goicoechea} |2019), about 40 


lensed quasars have now known time delays, although with 
variable precision, but the sample starts to be sufficiently 
large to vastly reduce the random uncertainties and to en- 
able a statistical study of the time-delay lenses. 


As time-delay cosmography is now entering a new regime 
with an increasing number of lensed quasars being discov- 
ered every year, the time delays now need to be measured 
rapidly to turn these newly discovered systems into cosmo- 
logical constraints. demonstrated that 
it is possible to obtain accurate time-delay measurements 
in only one monitoring season thanks to the small ampli- 
tude variations of the quasars, of the order of 10 to 50 mil- 


limag, that happen on a timescale of weeks or months. These 
variations are faster than the microlensing variability, which 
varies on a typical timescale of several months or years. 
If detected at a sufficient signal-to-noise ratio (SNR), these 
features reduce the need for long light curves as it allows 
us to disentangle the intrinsic and microlensing variability 
more easily. However, this change of strategy requires al- 
most a daily cadence to obtain a sufficient sampling of these 
small features in the light curves. Their amplitude is of the 
order of 10 mmag, which requires 2-m class telescopes to 
obtain a sufficient SNR in 30 minutes of exposure at mag- 
nitude as faint as ~ 20. This is illustrated in Fig. 3] in the 
case of the bright quadruple quasar WFI 2033-4723 


2019). The technique has allowed to measure 6 
new time delays in | single season (Millon et al.| |2020c), 


with more to follow. 

In the future, the Vera Rubin Observatory will obtain 
high-SNR multiband data for all southern lensed quasars, 
opening the possibility of building a sample of a few hun- 
dreds lensed quasars with known time delays. The cadence 
will however be limited to one point every few days in each 
band, which might not be sufficient to obtain the time delays 
at a few percent precision for the most interesting targets. 
Complementary, observations from 2-m class telescope at a 
daily cadence might still be useful to refine the time-delay 
measurements of the most promising objects. 
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Fig. 3 Comparison of two observing cadences. On the two top panels are shown 14-year-long light curves of WFI 2033-4723, observed with a 
4-day cadence at the 1.2m Euler telescope at La Silla and at the SMARTS telescope at Las Campanas. During the season indicated with a red 
rectangle, the object has also been observed daily with the MPIA 2.2m at La Silla (bottom panel), unveiling exquisite small-scale structures that 
vary faster than microlensing. Such observations allow to measure time delays in | single season, with similar accuracy and precision than the 


lower cadence data over 14 years 2019}|Millon et al.|/2020a). 
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4.2 Time-delay measurements techniques 


Once well-sampled light-curves have been acquired, the next 
step consists in identifying features that can be matched in 
all light curves and measure the time delays. This steps is 
significantly complicated by the presence of extrinsic vari- 
ations due to microlensing in the light curves on top of the 
quasar intrinsic variations. The signature of micro-lensing 
can be seen in most lensed quasar light curves. In some 
cases, it manifests itself by a sharp rise of the luminosity in 
one of the multiple images, which happens when the source 
approaches or crosses a micro-caustic. This probably hap- 
pened, for example, in 2007 for image A of HE0435-—1223. 
In Figure [2| we show the light curves of HE0435—1223 as 
well as the differential curves between the images, high- 
lighting the microlensing signal. Caustic crossing events are 
not the only signature of microlensing visible in the data. 
Most quasar light curves also exhibit slow variations of the 
microlensing over several years. An example of this phe- 
nomenon is image B of HE0435—1223, which slowly raised 
by ~ 0.5 mag between 2013 and 2018. It typically happens 
when the stellar density is high and the quasar is located in 
regions where the microcaustics overlap. The net effect is 
a smooth variation of the microlensing magnification as the 
quasar moves through these crowded regions. The extrin- 
sic variations introduced by microlensing contains valuable 
information about the quasar accretion disk structure (see 
Vernardos et al. (in prepration)) but is a real source of nui- 
sance when measuring the time delays. 


To deal with this issue, several curve-shifting algorithms 
have been proposed over the years, which can be classified 
into two categories. On one hand, some methods are based 
on the light-curve cross-correlations (e.g. [1996), 
sometimes without attempting to subtract the microlensing 
variability (e.g. the smoothing and cross-correlation method 
by (2015). On the other hand, 
several techniques rely on the analytical modelling of the 
intrinsic variability of the quasars and/or microlensing vari- 


ations with, for example, splines 2013a) or 
Gaussian Processes (e.g.|Hojjati et al.}/2013). When explic- 


itly modelled, the microlensing variations are removed from 
the light curves before attempting to find the optimal time 
delays. Due to the broad band nature of the monitored sig- 
nal, mixing flux arising from multiple emission regions, mi- 
crolensing is rarely perfectly removed but this is shown to 
have in general a negligible impact on the delay 
[Tewes|/2014). One can also mention the recent work by{Tak| 
et al.|(2016), Donnan et al|(2021) and Meyer etal |(2022). 


aiming to infer the time delays in a Bayesian framework, in- 
cluding an explicit modelling of the microlensing variations. 


These methods were tested in the "Time Delay Chal- 


lenge"(TDC; |Dobler et al.) 2015), a blind data challenge 


aiming at assessing the precision and accuracy of the curve 


shifting algorithm on simulated but realistic data, which in- 
cludes the microlensing variability. The results and conclu- 


sions of the challenge are presented in [Liao et al.| (2015a) 
as well as in individual papers (Hojjati and Linder} |2014 
2016). The problem is in fact more compli- 


cated than it sounds since a large fraction of the participat- 
ing teams did not meet the requirements in term of precision 
and accuracy on the first and simplest rung of the challenge. 
Among the qualified teams to participate to the more ad- 
vanced rungs of the TDC, the different proposed techniques 
showed overall good performance given the actual quality 
of the data. Several teams reached an accuracy of <$1% on 
the most variable light curves. However, it remains to be 
checked if this level of performance holds if more realistic 
accretion disk emission mechanisms and source-size effects 
are included in the simulations. 


Among these source-size effects, microlensing time de- 
lay, which is described in details in{Tie and Kochanek|(2018), 
may be a more subtle manifestation of microlensing acting 
as a nuisance to measure the time delay. Although it has 
never been detected so far in lensed quasar light curves di- 
rectly, this effect may arise when different emission regions 
of the accretion disc are differentially magnified. A sim- 
ple model to explain the UV and optical variability of the 


quasars is the "lamp post" model (e.g. 2007 
Starkey et al.|!2017), where the luminosity fluctuations orig- 


inate close from the supermassive black hole and then illu- 
minate the rest of the disk. This triggers temperature fluctu- 
ations in the disk, which result into delayed UV and optical 
emission due to the light travel time from the center. In the 
absence of differential magnification caused by microlens- 
ing, this time lag cancels out between the multiple images 
and only the "cosmological" time delay is observed. How- 
ever, if one of the multiple image is affected by microlens- 
ing, the time lag originating from a particular region of the 
disk might be amplified by microlensing and a net excess 
of microlensing time delay can add to the "cosmological" 
time-delay. This effect could reach a few hours to a couple of 
days, which is negligible for most of the systems with long 
time delays but it can significantly increase the uncertainties 
for systems with short time delays. This effect can however 


be mitigated with multi-band light curves (Chan et al.|/2021) 


or a proper Bayesian treatment of this effect as a source of 


nuisance (Chen et al.|/2018). 


Future developments of curve shifting algorithms might 
also include time-delay measurements from unresolved light 


curves (e.g.|Hirv et al.| 2007 2021} [Biggio et al. 
2021} |Springer and Ofek |2021), which will open the pos- 


sibility to monitor small-separation (< 1’’) lensed quasars. 
While precise delays from unresolved lightcurves have al- 
ready been measured in the gamma-ray range 
[201 1} {Cheung et al.|/2014), they cannot be used for cosmog- 


raphy as the location of the gamma-ray emission w.r.t. the 
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central AGN remains unknown. In the optical range, space- 
based large sky surveys, such as Euclid, are expected to dis- 
cover thousands of small-separation systems, which will not 
be fully resolved from the ground with our current follow- 
up facilities. Turning this large sample of small-separation 
lensed quasars into a useful cosmological probe will then 
require to develop these new techniques. 


5 Determining lensing potential 


Determining the lensing potential ¢ is a crucial ingredient in 
time-delay cosmography as it directly enters the time-delay 
prediction through the Fermat potential (Equation [7). The 
Fermat potential is generally dominated by a massive el- 
liptical galaxy acting as the main deflector and intervening 
line-of-sight over- and under-densities. To achieve a precise 
and accurate cosmographic inference, knowledge of both the 
line-of-sight structure and the mass distribution within the 
main deflector need to be known. One of the major limita- 
tions in a precise determination of the Fermat potential is the 
MSD, and the inability with imaging data to constrain the 
Fermat potential. Thus, either physical assumptions based 
on what we know from other modelled galaxies, e.g. in the 
nearby universe, or external data, such as stellar kinemat- 
ics, is required to constrain the Fermat potential. The mea- 
surement of the Hubble constant and constraining the galaxy 
density profiles are tightly connected and most of the ques- 
tions asked and techniques being used in Shajib et al. (in 
preparation) are of the same relevance and applicability for 
time-delay cosmography. 

We discuss observables and inferences from imaging data 
in Section|5.1] We then discuss assumptions on mass profiles 
in Section|5.2|and what external information provide neces- 
sary constraints in Section|5.3] 


5.1 Inference from imaging data 


High-resolution imaging of gravitational lenses is crucial to 
achieve a precise determination of the relative Fermat po- 
tential between multiple images of a time-variable source. 
In this section, we discuss the necessary aspects of the mass 
distribution that imaging data can constrain, apart from the 
remaining degeneracies. 

To derive constraints on the lensing deflector from imag- 
ing data, all components that affect the imaging data need 
to be modeled and accounted for simultaneously with the 
lens model. This includes, but is not limited to, the extended 
source component of host galaxy of the AGN or the tran- 
sient, the image positions of the time variable source and 
its resulting point-like flux emission, the surface brightness 
of the deflector galaxy, differential dust extinction, and any 
other sources of surface brightness. In addition, instrument 


effects, such as the point spread function (PSF), noise (both 
shot-noise and instrumental noise), pixelization and poten- 
tial data reduction artifacts need to be accurately taken into 
account. 

The lensing effect affecting an extended surface bright- 
ness S(), such as from the extended host galaxy, can be 
computed with a method known as “backwards ray-tracing’. 
Making use of the fact that surface brightness is conserved 
through lensing, the surface brightness at a position in the 
image /(@) can be computed by the surface brightness in the 
source plane associated with the corresponding coordinate 
BO) as 1(8) = S(B(O)), where B(@) = @ — a(@) is the ‘ray 
tracing’ term given by the lens equation (Eqn.[I). 

For unresolved point-like images, the backwards ray- 
tracing is numerically inefficient. To guarantee that multiple 
images precisely come from the same location in the source 
plane within the astrometric requirements for an accurate 
time-delay prediction, the lens equation has to be solved for 
the point source constraints within the astrometric precision, 
or alternatively, solutions not satisfying the astrometric re- 
quirement (e.g.,{Birrer and Treu] /2019) need to be discarded. 

Given a lens model with parameters Emass and surface 
brightness model with parameters &jjgn, a model of the imag- 
ing data can be constructed, dodge). The full process of sim- 
ulating a modeled image can be cast as a consecutive ap- 
plication of operators as follows: starting with the surface 
brightness operator S, the lensing operator £ is applied on 
the lensed source, followed by a PSF convolution operation 
C, and finally an operator P matching the pixel resolution 
of the data, formally an integral of the convolved surface 
brightness over the size of a pixel. With this notation and © 
denoting the consecutive application of operators from left 
to right, we can write the generation of modeled data as 


A nodel = P io) Cc io) | £XGieas) © Ssource(Etignt) + Stens (Eight) | : 
(24) 


The Bayesian analysis to constrain the lens model is per- 
formed on the pixel-level likelihood of the imaging data. The 
likelihood is computed at the individual pixel level account- 
ing for the noise properties from background and other noise 
properties, such as read-out, as well as the Poisson contri- 
bution from the sources. In the Gaussian limit the imaging 
likelihood is given by 


P(Dimg | Emass > light) 


_ exp |-5 (daata = model) Final (daata _ drrodei)| (25) 


Ni (27) det(Zpixet) 


where k is the number of pixels used in the likelihood and 
pixel is the error covariance matrix. We also note that for the 
flux noise, the error covariance matrix Xpixe] is a function of 
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the brightness of the model diode; and hence not indepen- 
dent of the model prediction. Current analyses assume un- 
correlated noise properties in the individual pixels and the 
covariance matrix becomes diagonal. 

The primary target of an imaging analysis is to retrieve 
the lens model parameter posteriors marginalized over other 
model parameters, in particular the surface brightness and 
regularization parameters as 


P(E mass | Ding) = [ ring | E mass: Eligth) P(Emass> Flight) dEjignt, 
(26) 


where p(€mass> light) denotes the prior on the lens and light 
model parameters. 

Different techniques exist that have been developed to 
jointly marginalize over a complex and unknown source motr- 
phology. This consists of regularized pixelated source recon- 


sruton eg. Warren and Dye| [2003 Trev and Koopmans 
B04) Koopmans] 2005} Suyw eta 20062003), a set of 
basis functions, such as shapelets (e.g.,/Birrer et al.| 
oF wavelet 
|Galan et al.|[2021), or parameterized surface brightness pro- 


files, such as Sersic profiles. The surface brightness ampli- 
tude components of all these methods have in common that 
they create a linear response on the pixels. The maximum 
likelihood of the data given a proposed model for the am- 
plitude components is thus a linear problem, and the Gaus- 
sian covariance matrix of the linear coefficients can be used 
to analytically marginalize over the prior (e.g.,|Suyu et al. 
£2006) [Vegetti and Koopmans] 2009|[Birer etal. 2015). 

The joint sampling of lens and light model parameters 
to infer the lens model posterior distribution is then a semi- 
linear process. While the amplitudes of the light model co- 
efficients can be solved linearly, the remaining parameters, 
including those pertaining to the lens mass model, and other 
shape-related surface brightness parameters have to be sam- 
pled non-linearly. 

Often it is not clear at the beginning of an investigation 
what the level of complexity in the model is required to de- 
scribe the data and to guarantee an accurate modeling. Cur- 
rent procedures are to start with a simple model and subse- 
quently increase the complexity in the different model com- 
ponents until a satisfactory fit is achieved. Current criteria 
for a goodness of fit in use are the Bayesian Information Cri- 


teria (BIC) and the Bayesian Evidence 
(Shajib et a.[2020), 

Imaging modeling is primarily performed on high reso- 
lution space based Hubble Space Telescope (HST) or ground- 
based adaptive optics (AO; 
imaging. Figure [4] illustrates, as an example, the imaging 
data and models for two quadruply imaged lensed quasars, 


originally presented by |Shajib et al.}(2020) and 


(2019). Figure[5|presents the key lens model posteriors from 
the imaging modeling fit of the lens HE0435—1223 by|Chen| 
for both, HST and AO imaging, for a power-law 
elliptical mass distribution with external shear contribution. 

The PSF needs to be characterized very accurately, both 
to provide a high astrometric precision of the images of the 
time-variable sources 
and for the detailed modeling of the extended source 
structure without spurious signal of bright quasar images. 
Current methods to obtain a precise PSF model contain an 
iterative procedure during the model fitting process to ex- 
tract improved constraints of the PSF from the data itself 
(eg. Chen eta. 2016) Birrer et a 2076). 

The posteriors are marginalized over a set of systematic 
effects and modifications in the choice of the source recon- 
struction and other modeling choices. 


5.2 Mass profile assumptions of the main deflector 


Resolved multiply imaged structure is an exquisite data prod- 
uct to provide constraints on the relative deflection field (see 
Section [5.Ip. In the absence of knowledge of an absolute 
source size or brightness, imaging data constraints can not 
break the MST and its generalization, the source position 
transformation (SPT;|Schneider and Sluse|/2014). The quan- 
tity that is constrained by imaging data along the radial di- 
rection is 


Osa” Ora” 
Erad = 1 = = ? (27) 
-a, l-kg 


where aj, is the derivative and ay is the double derivative of 
the deflection angle at the Einstein radius 6g, respectively, 
and xg is the convergence at 6. We refer to[Birrer|(2021) for 
a discussion on azimuthal constraints. Constraining a global 
mass density profile based on imaging data alone requires 
assumptions on the radial profile. For example, when im- 
posing the assumption that the mass density profile follows 
a single power law, the power-law slope yp) has a direct cor- 
respondence to &;aq (Eqn.[27) with 24 = Ypi—2 and a precise 
(few percent uncertainty) inference on the Fermat potential 
is possible. Relaxing the assumption on the shape of the den- 
sity profile leads to significantly widened constraints. When 
allowing for a full mass-sheet degree of freedom, the Fermat 
potential is fully degenerate (i.e. &,44 remains unchanged by 
a MST). A pure mass sheet is unphysical as no localized 
three-dimensional density profile can describe it. However, 
approximate parameterizations can be found that can be ex- 
pressed as a three-dimensional density profile and are indis- 


tinguishable based on imaging data (Schneider and Sluse 
2013}|Blum et al.}|/2020 2020} Yildirim et al. 
2021). 


Figure [6] illustrates an example of how an approxi- 
mate MST can be physically interpreted when applied to a 
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Fig. 4 Illustration of imaging modeling for two lenses. From left to right: Imaging data, the reconstructed model, the reduced residuals (gata — 
dmodet)/7, teconstructed source. Top row: HST data and model for DES J0408—5354 in three bands, from |Shajib et al.| (2020), with shapelet 
and parameterized source reconstruction using the modeling software LENSTRONoMY. Bottom row: Keck adaptive optics imaging and modeling of 
HE0435-—1223, from|Chen et al.|(2019), with pixelated source reconstruction using the modeling software GLEE. 
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Fig. 5 Key lens model parameter posterior from the fit to imaging data (HST and Keck adaptive optics). q is the semi-minor to semi-major axis 
ratio in the projected mass density profile, 6g is the Einstein radius, y the three-dimensional radial power-law density slope, Yext is the external 


shear strength, and 6.x: is the shear angle. Figure adopted from (2019). 
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composite profile described with a stellar and a dark matter 
component. 

There are also possibilities in deviations in the mass den- 
sity profiles that do not directly mimic an MST. Any radial 
mass profile that satisfies the same constraints on &,,q (Eqn. 
provides an equally good fit to the datq)] Azimuthal 
assumptions of the mass density profile do also matter in 


the interpretation of the radial components 2021 
2021) and assumptions on disky and boxiness, 


ellipticity gradients and isodensity twists of the density pro- 
file may also impact the Fermat potential differences 
[Vyvere et al][2022bja] (Gomer and Williams](2020][2021), 
From a physics point of view, the matter distribution of 
the main deflector is made of stellar mass, gas, and dark 
matter, where the stellar mass is dominating the inner-most 
parts. The dark matter fraction within the Einstein radius is 
about ~ 10 ~ 60% (e-. 
(2005). Invisible substructure in the lens and along the LOS 
can also perturb the Fermat potential (e.g..{Oguri|/2007}/Kee-| 
{on and Moustakas| (2009). (Gilman et al_(2020) showed that 


omitting dark substructure does not bias inferences of Ho. 
However, perturbations from substructure contribute an ad- 
ditional source of random uncertainty in the inferred value 
of Ho ranging from 0.7 - 2.4% depending on the redshift and 
image configuration. We also highlight that the lensing mass 
and convergence only accounts for the mass over-density in 
regard to the cosmological background density. 

Different approaches running with different assumptions 
have been taken in the literature. Among the assumptions 
being used are single power-law mass profiles, composite 
models involving a mass-follows-light component with a sep- 
arate component describing the dark matter profile, free- 


form pixelated mass profiles (e.g.,{Saha and Williams] /2004} 
031), pixelated lensing 
potential corrections (e.g., (2009), or an explicit 
internal mass profile MST component 
BBirrer et al|[2020). 

There are multiple considerations in the specific choice 
of an investigation. On one side, there are physical consider- 
ations. What basic assumptions in the modeling deem justi- 
fied? What priors to chose in the Bayesian modeling? Then 
there are also practical considerations. What aspects of the 
model can be constrained by the data? Is it feasible to per- 
form a posterior inference in a finite amount of time with 
given resources? 


Among the simplest models employed is the single power- 


law profile. It has a one-to-one relation to the radial quantity 
described in Equation|27|and breaks the MST. A power-law 
elliptical mass profile is an efficient parameterization to de- 


3 &4q is the first-order Taylor expansion term affecting the observed 
radial distortions. The quantity is well defined for an azimuthally sym- 
metric lens. The generalization of the relevant radial quantity for ellip- 
tical mass models needs to be investigated. 


scribe the first order radial and azimuthal features in strong 
lensing imaging data. Composite models do relate to certain 
physical assumptions of mass follows light and assert a stiff- 
ness in the profile that implicitly also break the MSTF'] 

An explicit parameterization of the MST in the model 
denies any prior or assumptions to break the MST and is 
maximally agnostic to the MST with minimal added param- 
eter degrees. 

On the high-complexity end of lens models are free- 
form methods, such as pixelated mass distributions 
2014). Free-form models 
come with very few restrictions on the lens mass distribu- 
tion and offer a different modeling strategy compared to the 
simply parametrized approaches. The ensemble of models 
allowed by the data can be interpreted as the model poste- 
rior distribution, with the regularization scheme proposing 
models without data constraints being the prior. 

Increased flexibility in the parameterization better guar- 
antees that the underlying truth in the mass distribution, and 
in particular the prediction of the Fermat potential entering 
the time delays, can be represented by the model. On the 
other hand, increased flexibility in the model at fixed data 
constraining power increases the uncertainty in the posterior- 
predictive model. Less constraining posteriors put inevitably 
more weight and reliance on the priors applied, whether they 
are explicitly in a parameterized form, or implicit within 
an over-parameterized, free-form approach. No matter what 
choices are being made in the modeling of lenses, mitigating 
the dependence on the explicit or implicit priors becomes 
important when combining a set of multiple lenses, as we 
will discuss in Section[7.2] 

We discuss additional data sets that can constrain the 
lens mass profile in Section[5.3] 


5.3 Non-lensing observables 


The currently used primary observation to break the MSD 
is stellar kinematics from the deflector galaxy 
[Koopmans|2002|[Koopmans et al 2003][Koopmans]2004). 
The kinematics of stars, in particular their velocity disper- 
sion, is a direct and lensing-independent tracer of the three- 
dimensional gravitational potential. 

The measurement is performed by targeting stellar ab- 
sorption lines and measuring their width with spectrographs. 
Figure[7|shows a Keck/LRIS spectrum of HE0435—1223. 

The line-of-sight stellar velocity dispersion is an inte- 
grated quantity of the radial and tangential components of 
the velocity dispersion projected along the line of sight. The 
orbital anisotropy, i.e., the ratio of the radial and tangential 


* The breaking of the MST for composite models is dependent on 
imposed mass and concentration priors of the dark matter profile, as 
well as mass-to-light gradients or the absence of it. 
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Fig. 6 Illustration of a composite profile consisting of a stellar component (Hernquist profile, dotted lines) and a dark matter component (NFW 
+ cored component with A, acting as an approximate MST (from {Blum et al.|{2020), dashed lines) which transform according to an approximate 
MST (joint as solid lines). The stellar component gets rescaled by the MST while the cored component transforms the dark matter component. 
Physically, the profiles of each color differ by a 10% different mass-to-light ratio combined with a slightly more extended or contracted dark matter 
profile also on the 10% level. Left: profile components in three dimensions. Right: profile components in projection. Each profile provides a 10% 
difference in the predicted time delay, and hence Hp inference. The transforms presented here cannot be distinguished by imaging data alone and 


require i.e., stellar kinematics constraints. Figure from (2020). 


velocity dispersion components, is unknown a priori and 
thus introduces a degeneracy in the predicted line-of-sight 
velocity dispersion corresponding to the same 3D mass pro- 
file. This degeneracy is known as the mass—anisotropy de- 
generacy. Typically, a prior on the anisotropy profile, e.g., 


isotropic or Osipkov—Merritt (Osipkov}|1979 1985alb), 


is assumed. The Osipkov—Merritt profile allows the anisotropy 


to be isotropic near the center and gradually more radial far- 
ther away from the center, which is motivated by the ob- 
served properties of the stellar orbits in local elliptical galax- 
ies. The isotropic profile is thus a special case of the Osipkov— 
Merritt profile. The transition scale radius ray; from isotropic 
to radial orbits in the Osipkov—Merritt profile is a priori un- 
known, which directly manifests in the mass—anisotropy de- 
generacy. Thus, the prior on ani has strong impacts on the 
kinematics prediction (e.g.,{Shajib et al.| 
(2020). We note that there are many other forms of the radial 
anisotropy distribution and the specific choice of functional 
model used might impact the results, as well as what priors 
are adopted. One way to mitigate this degeneracy is to obtain 
spatially resolved measurement of the velocity dispersion 
instead of an unresolved (or, integrated) velocity dispersion 


fTreu)2021. 


Other proposed observations and analyses methods that 
can break the MST and provide the necessary constraints on 
the mass density profile are standardizable magnifications 


(c.g,|Kolattand Bartelmann][1998) Oguri and Kawano] 003) 
FFoxley-Marrable et al 2018} [Birrer et al] 2021), and lens 
population statistics of appearances and asymmetry in the 
multiple images (e.g., 
inenfeld) 202). 
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Fig. 7 Top: Keck/LRIS spectrum of HE0435—1223 with the best- 
fitting model overplotted in red and a polynomial continuum, which 
accounts for contamination from the lensed QSO images and template 
mismatch, shown in green. The measurement results in 7, = 222 + 15 
km s7!, including systematic uncertainties due to the templates used, 
the region of the spectrum that was fitted, and the order of the polyno- 
mial continuum. The grey vertical band represents a wavelength range 
that is excluded from the fit due to the presence of a strong Mg II ab- 
sorption system. Bottom: Residuals from the best fit. Figure adopted 


from|Wong et al.|(2017). 


We emphasize that these non-lensing observations are 
primarily sensitive to the total MST, the combination of LOS 
and internal mass density profile degeneracies (Eqn. [20). 
Decoupling of the different projected effects in the lensing 
potential is not necessary to perform an accurate cosmo- 
graphic inference since the time-delay prediction only re- 
quires the combined product. However, when combining dif- 
ferent lenses with potentially different selections, the priors 
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and assumptions imposed in either of the two components 
impacting the MST can become important. 


6 Estimating line-of-sight contributions 


The contribution of the mass density fluctuations along the 
line of sight to the lensed source is generally of order few 
percent, and commonly lower than 10% of the total conver- 
gence of the lens. While this may appear to be small, it is not 
negligible when it comes to estimating the Hubble parame- 
ter to percent accuracy. A constant effective contribution of 
a few percents caused by the line-of-sight is equivalent to an 
external mass-sheet Kext (Sect. [2). 

The exact impact of the line-of-sight objects depends on 
whether the dominant-lens approximation is valid, in which 
case the critical density of the line-of-sight objects is very 
small compared to the main deflector critical density, and on 
whether the tidal regime is applicable, which happens when 
the perturber’s gravitational field is small compared to the 


changes of the deflection a(@) (e.g. |McCully et al.) 2014 
2017) {Fleury et al.|/2021b). When one of those 


approximation is invalid, an explicit treatment is needed, 
requiring potentially to solve the multi-plane lens equation 
(we refer to Saha et al. (in preparation)). Conversely, when 
line-of-sight objects can be treated as small perturbations 
that only introduce convergence that is constant over the ex- 
tent of the lensed system, a statistical treatment is sufficient. 
In practice, a hybrid scheme needs to be followed most of 
the time, including explicitly modelling those objects that 
modify differently the Fermat potential for each lensed im- 
ages, and calculating the statistical contribution of the other 
objects that shift the Fermat potential in linear order. 

From an information perspective, there is only limited 
direct data available of the total matter distribution on the 
universe at the scales relevant ton constrain Kext. Hence, any 
method relies on some assumptions on how mass traces light. 
These assumptions are well motivated by large scale struc- 
ture probes, but are only validated in a statistical way. 

The following subsections present the various methods 
that have been considered to estimate Kext. Section[6. I]presents 
a direct modeling, Section|6.2]presents galaxy number counts 
statistics, Section|6.3] weak lensing measurements, and Sec- 


tion|6.4]a hybrid approach. 


6.1 Direct modeling 


The most direct approach is to collect the positions, red- 
shifts, stellar masses and potentially even velocity disper- 
sion measurement of the galaxies located in the field of view 
towards the lens and explicitly model the matter distribu- 
tion of all relevant objects. A complete direct reconstruction 


is near-impossible. A simple approach that has been devel- 
oped to estimate Kex¢ consists in identifying which galaxies 
form massive galaxy groups that contribute the most sig- 
nificant impact along the line of sight (e.g. 


(2017). An estimate of xx: may then be derived 
by fitting analytical mass density profiles on those groups 
(Auger etal. [2007 [Wilson etal, [2017}. This approach gen- 
erally yields estimates of kext typically uncertain to a fac- 
tor 2-4 depending on the specific assumptions one may rea- 
sonably make on the group mass density (e.g., halo associ- 
ated to each individual galaxies, a common halo for all the 
systems), but also sometimes with low precision due to the 
uncertainties associated to the group identification (field of 
view are never complete and group finders have their own 
biases). To overcome this problem, Collett et al.|(2013) have 
proposed a simple halo model prescription to reproduce the 
mass along the line-of-sight from a photometric catalogue 
of galaxies. The convergence x, from each halo has then 
been calibrated against kex, derived from ray-tracing esti- 
mate through numerical simulations. This method does not 
account explicitly for the convergence due to dark structures 
and divergence due to voids, but those effects are included 
statistically owing to the calibration of x, against Kext. 


6.2 Number counts 


An alternative to the direct modeling of the line of sight con- 
sists of measuring the galaxy number density in the vicinity 
of the lens as a summary statistics and comparing it to refer- 
ence fields, hence determining whether the LOS is over- or 


under-dense compared to average (Suyu et al.|/2010 
ZOTT} [Greene etal} 2OT3 2017). 


First, a detailed characterization of the line of sight towards 
the lens is required, using deep imaging data, complemented 
by spectroscopic data (see Fig. [8}for an illustration). Similar 
LOS are then searched for in large volume, and high reso- 
lution numerical simulations. The surface mass density of 
matter along these line of sight being calculated using a ray- 
tracing technique (Hilbert et al.|{2007|[2008| {2009}, it is then 
possible to derive a probability distribution of external con- 
vergence compatible with the observed lens matching the 
summary statistics. 

In practice, summary statistics that deviate from pure 
number counts can be a better informed statistics of the un- 
derlying over- or under- density. For example, if Ngai galax- 
ies are observed in the field of view of a lens, one can calcu- 
late a weighted number counts W, = peas q; with q being a 
particular type of weight, such as the inverse of the distance 
to the lens, i.e. 1/r. To calculate a weighted density of galax- 
ies, ¢,, it is necessary to perform the same measurement over 
an ensemble of control fields, such that for each control field 
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Fig. 8 Example of the result of the spectroscopic characterization of the field of view of the gravitational lens HE0435—1223. The redshifts of the 
objects measured in the field of view are used to identify which objects (or galaxy groups) needs to be explicitly included in the macro-model. The 
number counts analysis uses the galaxies located at projected distances of <45’’ and <120” from the lens to estimate P(k.x,). Courtesy of 
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Fig. 9 Probability distribution P(kex:) for HE0435—1223 and an aper- 
ture radius of 45”. The different curves correspond to all lines of sights 
(dotted red), considering only lines of sights with the same overdensity 
as the data (dash-dotted blue), using a weighting inversely proportional 
to the distance (dashed green) and the additional constraint from the 


shear (solid black). Courtesy of[Rusu et al.| 2017 


(CF, j) one derives a density re = W,/ Wr "J To avoid in- 
troducing any bias through this normalisation procedure, it 
is important to choose enough control fields but also ensure 
that those fields have characteristics that match closely those 
of the imaging data of the observed lens system. This allows 
one to account for sample variance and to assess that galaxy 
detection biases are similar for the lens and for the control 
fields. In particular, one should ascertain that the lens and 
control fields have similar depth, seeing, and pixel scale, the 
latter quantities being critical in the framework of source 
deblending and identification. It happens that some regions 
of the control fields are masked due to saturation of stars, 
cosmic rays, or camera defects. To guarantee unbiased es- 
timates of ¢,, it is important to apply the same mask to the 
weighted count of the lens field. A large variety of weighting 
schemes has been explored, some of them involving a proxy 
on some of the galaxy properties such as the redshift and the 
stellar mass. Those quantities are derived using a photomet- 
ric redshift code, such as LEPHARE (Ilbert et al.| 2006). This 
implies the availability of deep multi-band photometric data. 
Having a similar depth for the lens and comparison field is 
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important to matching galaxy properties in photometric sur- 
veys. Also, similar set-up for magnitude measurements are 
required to minimize systematic errors caused by aperture 
and/or object deblending uncertainties. When spectroscopic 
redshift are also available, they may generally be preferred 
to the photometric ones. 

The conversion of the probability density distribution 
P(Gq) into p(Kext) requires the use of numerical simulations 
for which kx, has been derived through ray-tracing. Large 
simulation volumes are required to minimise the impact of 
sample variance. The Millennium simulation (Springel et al. 
being dark matter only, galaxy photometric proper- 
ties are inpainted using physically motivated prescriptions. 
A common choice has been to use the semi-analytic model 
of [De Lucia and Blaizot| (2007). Densities ¢, can be esti- 
mated in those catalogues in similar way as for real data 
except that the reference fields is now the simulation cata- 


log itself. As explained in/Rusu et al.|(2017), the probability 


P(Kext | ), where d stands for the available data, is given by: 


PlKexts£q) PCy @) 
ext | @) = dé, —————————— = 
Pies 1) 2 D(C,)p(a) 


[ apples | fq)p(q 1d). (28) 
(2013) has shown that the precision on 


Kext 1s improved by a factor 2 when using a combination 
of weights, the two main ones being the standard number 
counts (q = 1), and as the inverse of the projected distance 
to the lens (g = 1/r). The addition of a weight based on the 
modeled amplitude of the shear, y.t is also often considered, 
such that the general expression of p(Kext | d) becomes: 


Plt 1A) =f de depp, dyn Pease | Ser 
bgt Ars Spon PC1 Stee Sg#A Airs Sem |G) (29) 


The addition of a fourth weight, g # 1,1/r allows one 
to evaluate systematic errors involved by the specific choice 
of equally motivated weighting schemes and explore which 
combination of weight yields the best precision on Kext. The 
first application of this technique in the framework of time- 


delay cosmography has been presented in (2010). 


Subsequent time-delay cosmography analyses from HOLICOW, 
STRIDES and TDCOSMO have used an approach that broadly 


follows the strategy outlined above (see[Greene et al.| {2013 
2017| for a more in-depth description of the 


method), but proposing small variations and tests in terms of 
weighting schemes and choices of comparison fields 
2050). 
Figure [9|displays P(Kext) aS derived with different weighting 
schemes for the lens system HE0435—- 1223. 

A novel method for estimating k-x; has recently been pro- 


posed by (2021). They replace the weighted num- 


ber count scheme by a machine learning approach. Specif- 
ically, they have trained a Bayesian Graph Neural Network 


on LSST DESC DC2 sky survey (LSST Dark Energy Sci- 
ence Collaboration (LSST DESC) et al.||2021) in order to 


derive a distribution of k.,, for arbitrary gravitational lens 
sight-line. 


6.3 Weak Lensing 


Weak lensing, the linear shape distortion of background galax- 
ies due to foreground structure, is a direct probe of the LOS 
structure. On linear scales, the cosmic shear measurements 
can be translated to convergence in a unique mapping 
[and Squires] |1993). Hence, this technique does neither rely 
on priors from numerical simulations nor of a galaxy-halo 
connection. However, there are also several drawbacks. The 
angular scale of a weak lensing measurement is limited by 
the number density of lensed sources, and a high S/N mea- 
surement can only be achieved at scales of arc minutes. Thus, 
weak lensing is an excellent observable to quantify large 
scale cosmic density distributions but other smaller scale 
density perturbations down to the scales of arc seconds are 
not well captured. Another limitation is that the weak lens- 
ing source population is not at the same redshift as the strongly 
lensed source. This requires to translate the weak lensing 
convergence map to a different lensing kernel, which comes 


with additional statistical uncertainties (e.g., 
2021). 

In the strong lensing context, for example, 
Epson] (1997); [Nakajima eta 2009 Fadel eta 2010} 


relied on the weak lensing effect produced by massive struc- 
tures in the vicinity of the deflector. They constrained the ex- 
ternal convergence by integrating the tangential weak grav- 
itational shear in the area around the lens. More recently, 
applied the weak lensing tech- 
niques to the quadruply lensed quasar systems HE0435—1223 
and B1608+656 and performed a convergence map recon- 
struction based on HST imaging. per- 
formed a convergence map reconstruction of the COSMOS 
field at the position of discovered strong lenses. 


6.4 Hybrid framework 


Given the strengths and weaknesses of the direct modeling 
and summary statistics approaches, as well as weak lensing 
measurements, a hybrid approach can leverage the comple- 
mentary methodologies. Summary statistics can be most ef- 
fectively employed for objects that mostly cumulatively af- 
fect the lensing convergence while explicit modeling of LOS 
objects makes a difference for massive or very close by ob- 
jects. The specific decision of where to split the analysis be- 
tween a statistical approach and explicit modeling is primar- 
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ily impacted by two factors. The first is the accuracy in the 
deflection properties, both in terms of higher-order lensing 
distortions and the need for a multi-plane lensing approach. 
The second is the available information and the handling of 
priors in the absence of sufficient information. 

A method to account accurately for the line-of-sight has 
been proposed by (2017). It consists 
in a multi-plane lens equation where only the planes associ- 
ated to important perturbing groups/clusters/galaxies are in- 
cluded. The other perturbers along the LOS are treated under 
the tidal approximation. In order to identify those objects, 
(2017) proposes to use a threshold based on 
the value of the flexion-shift, i.e. 43x whose expression is 
given by: 


Og On) 
Asx = f(B) x EO 


(30) 
where 6g and 6g, are the Einstein radius of the main lens 
and of the perturber, and @ is the angular separation on the 
sky between the lens and the perturber. The function f(@) = 
(1—) if the perturber is behind the main lens, and f(8) = 1 
if the galaxy is in the foreground. In that expression, f is 
the pre-factor of the lens deflection in the multiplane lens 
equation: 


a DapDos 
DopDas : 


(31) 


where Dj; = D(z,zj;) are angular diameter distances be- 
tween redshifts z; and z;, corresponding to the observer (0), 
deflector (d), perturber (p) and source (s). Missing to ac- 
count for a foreground perturber may have a stronger impact 
on the models than missing a background one. The reason 
is that the background perturber will have a multiplicative 
effect on the source position, while the deflection from the 
foreground pertuber enters the lens equation inside the ar- 
gument of the deflection of the main lens galaxy. In other 
words, the foreground perturber modifies the coordinates of 
the lensed images positions compared to the main lens case. 
These non-linear effects require a multi-plane treatment to 
be properly accounted for. From a set of simulation of time- 
delay lens systems resembling real ones, and their subse- 
quent modeling based on point-source image positions, [Mc-| 


Cully et al.|(2017) suggests that a value 43x < 1074 arcsec 


yields to a bias on Hp of less than a percent. Since [Sluse| 
let al.|(2017), this prescription is used by the HOLICOW and 
TDCOSMO collaborations to select the objects that they ex- 
plicitly include in the lens mass modelling. 

OTT: combined the 
study of the environment using the halo-rendering approach, 
i.e. linking the galaxy stellar masses to the underlying mass 
distribution, with the external shear measurements of the 
strong lens system. Their combined approach yielded tighter 


constraints on the inferred external convergence compared 
to a halo-rendering approach only. 


7 Cosmographic inference 


Having established the necessary observations and analy- 
ses components in the previous sections, in this section we 
discuss how an end-to-end combined analysis leads to con- 
straints on Ho and other relevant cosmological parameters. 
First we discuss the analysis for a single lens (Section [7.1) 
and then state the analysis for a set of multiple lenses (Sec- 


tion[7.2). 


7.1 Single lens cosmography 


For each individual strong lens, there are preferrably four 
data sets available: (i) imaging data of the strong lensing fea- 
tures and the deflector galaxy, Dimg; (2) time-delay measure- 
ments between the multiple images, D,q; (3) stellar kine- 
matics measurement of the main deflector galaxy, Dspec; (4) 
line-of-sight galaxy count and weak lensing statistics, Dios. 
These data sets are independent and so are their likelihoods 
in a joint cosmographic inference. Hence, we can write the 
likelihood of the joint set of the data 


D= {Dimg, Dua, Dspec, Dios} (32) 


given the cosmographic parameters {Dg, Ds, Das} = Dass 
as 


LDDass) = f LDinl gas ig 


x L(DialEmass; light, A, Dat) 
x L(DspeclEmass Flights Panis Aa, Ds /Das)£(DioslKext) 


x P(Emass: Sight Aint» Kext» Bani dE mass UE tight A int KextABani- 
(33) 


In the expression above we only included the relevant model 
components of the individual likelihoods. €jignt formally in- 
cludes the source and lens light surface brightness°| 

The sampling of the cosmographic posterior from the 
joint likelihood of Equation (33]can be split in parts to sim- 
plify the problem. For example, we can first perform the 
imaging analysis providing constraints on €jens and &jjgn, With- 
out sampling the cosmological or distance parameters. In 
turn, simple sampling the €jens and &jjoh¢ posteriors in post- 
processing when evaluating the time-delay likelihood and 
stellar kinematic likelihood can translate the posteriors into 
distance posteriors in Dy, — Da space. Marginalization over 
different modeling choices can also be done in the Dy, — Da 


5 To evaluate the time-delay likelihood, we require the time-variable 
image positions from the set of &jgn parameters. 
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posterior space. Figure[10|provides an example of D4, — Da 
for a set of different modeling choices. 

Weighting the posteriors of different models can be done 
with Bayesian model comparison. This weighting also al- 
lows one to combine models in a single posterior, which 
then includes systematics considerations. Discrete and fi- 
nite choices made in the models and scatter in the sampling 
and BIC calculation can lead to over-constraint model selec- 
tions. Procedures to take noise and finite model selection in 
the BIC estimate into account have been developed 
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Fig. 10 Median-subtracted and mean-divided relative angular diame- 
ter distance posteriors, D4; and Dg for a power-law mass density pro- 
file with three different source reconstruction settings. Max in this fig- 
ure refers to the polynomial order in the shapelets with the set of three 
numbers corresponding to three different sources being simultaneously 


modelled in the image. Figure adopted from|Shayib et al.|(2020). 


7.2 Population level analysis 


The overarching goal of time-delay cosmography is to pro- 
vide a robust inference of cosmological parameters, a, and 
in particular the absolute distance scale, the Hubble constant 
Ho, and possibly other parameters describing the expansion 
history of the Universe (such as Q, or 2), from a sample 
of gravitational lenses with measured time delays. 

In Bayesian language, we want to calculate the proba- 
bility of the cosmological parameters, z, given the strong 
lensing data set, p(a|{D;}y), where D; is the data set of 
an individual lens (including imaging data, time-delay mea- 
surements, kinematic observations and line-of-sight galaxy 
properties) and N the total number of lenses in the sample. 

In addition to 2, we denote € all the model parameters 
part of either a single lens analysis (Section|7.1) or present 


on the population level. Using Bayes rule and considering 
that the data of each individual lens D; is independent, we 


can write, following |Birrer et al.| (2020): 


pla | {Ditw) « LUDijwlt) pa) = { LUDiw |, ple, Ode 


N 
= | [| £@imorcedg. 64 


In the following, we divide the nuisance parameter, €, 
into a subset of parameters that we constrain independently 
per lens, &;, and a set of parameters that require to be sam- 
pled across the lens sample population globally, €o). The 
parameters of each individual lens, &;, include the lens model, 
source and lens light surface brightness and any other rele- 
vant parameter of the model to predict the data. Hence, we 
can express the hierarchical inference (Eqn. [34) as 


por (Din) fT] [LBs | Dasal.6in€ eG] 


p(n, {Eihn, Epop) 
Hip) 


where {&}y = {&1, &2, ...,Ey} is the set of the parameters 
applied to the individual lenses and p(€;) are the interim pri- 
ors on the model parameters in the inference of an individual 
lens. The cosmological parameters a are fully encompassed 
in the set of angular diameter distances, {Dg, Ds, Das} = Da.s.as; 
and thus, instead of stating a in Equation [35] we now state 
Dajs.as(%). Up to this point, no approximation was applied to 
the full hierarchical expression (Eqn. [34). 

Key differences among different inferences of Hp from 
a set of lenses involve, beyond the assumptions on individ- 
ual lenses, assumptions on the covariant nature and the prior 
on the population level of the governing hyper-parameters. 
For example,{Wong et al.|(2020) assumes full independence 
of the nuisance priors from one lens to another. Formally, 
within Bayes Theorem, this approach assumes perfect knowl- 
edge of the governing population hyper-parameter distribu- 
tion prior (Eqn.[35). In this approach, the distance posteriors 
of individual lenses can be interpreted as measurements and 
the cosmographic analysis can be done in solely operating 
in the Dy, — Da space with a direct independent and easy 
accessible likelihood description. 

(2020b) performed an analysis exploring 
the difference between two different radial mass density pro- 
file families, assuming that either all lenses are of one type 
or another, effectively treating one modeling choice as a co- 
variant nuisance parameter in their inference while keeping 
all other priors independent with an assumed population. 

is using a free-form approach in 
the modeling of individual lenses. The ensemble of mod- 
els allowed by the data for an individual lens is providing 


AE i}dE pop (35) 
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the model posterior distribution. The underlying regulariza- 
tion scheme is the implicit prior applied on individual lenses. 
The identical regularization scheme is applied to all lenses 
assuming independence in the priors without covariances 
in the choice of the regularization scheme between lenses. 
(2021) did not use any external information to 
break the MST. Hence, the specific choice of the regulariza- 
tion scheme with their underlying physical and regulariza- 
tion priors is responsible for the breaking of the MST on the 
population level. 

introduced the hierarchical analysis 
framework into time-delay cosmography and identified few 
key parameters, that on a per lens basis are not sufficiently 
well constrained and thus the population prior can signif- 
icantly affect the outcome of the analysis. The parameters 
hierarchically sampled, beyond the cosmological ones, were 
the MST population Aint (Eqn.[21), and the stellar anisotropy 
distribution (see Section|5.3). 

[Park|implemented a Bayesian hierarchical framework to 
determine the external convergence distribution on the pop- 
ulation level for a full sample of lens systems used for time- 
delay cosmography and. 

The required population-level description of priors, in 
particular of parameters that can not be constrained to high 
precision (overcoming the prior in the analysis) do also need 
to take accurately into account potential differences among 
subsets of the population. For example, different lens dis- 
covery channels might preferentially select a different lens 
and line-of-sight population. 


8 Future constraints from galaxy clusters 


To date, galaxy-scale lenses have dominated the literature 
on Ho determination in the number of measurements and 
precision. Soon we may expect competitive constraints from 
galaxy clusters in measuring Hp and other cosmological quan- 
tities. Massive clusters are rich in multiple images and have 
definite advantages over individual galaxies. The main one 
is that sources at multiple redshifts break the mass-sheet or 
steepness degeneracy (e.g., (2004), which is 
the main degeneracy and hence source of uncertainty affect- 
ing galaxy-scale determination of Ho (see also Section [5p. 
Being extended on the sky, clusters are also more likely to 
have supernovae as background sources, whose time delays 
are relatively easily determined to a few percent precision, 
rivalling time-delay determinations from quasar sources in 
galaxy-scale lenses, but without the need for years of mon- 
itoring campaigns to obtain lightcurves. The drawback of 
clusters is that their mass distributions are more complex: 
they are dynamically younger than galaxies, and their multi- 
ple image regions sample a much larger fraction of the clus- 
ters’ virial radius than in galaxies. Therefore the multiple 
image region of clusters is expected to be more abundant 


in substructure, and hence harder to model. These difficul- 
ties can be circumvented if there are few or several hundred 
multiple images, then Hy can be estimated to a 1-few % pre- 
cision (Ghosh et al.| 2020). At present, in a cluster lens like 
MACS 1149, one can estimate Ho to 6%, assuming a con- 
servative 3% uncertainty on the observed time delay 
078). 

The first cluster lens to produce a precise estimate of Ho 
will likely be MACS 1149, where the first confirmed multi- 
ply imaged supernova was observed a few years ago 
fet al.|/2015). The long time delay before the reappearance of 
the last arriving image—saddle in the arrival time surface of 
the cluster—allowed the lensing community to make model 
predictions for the time of the reappearance. Most models 


agreed reasonably well on 250-350 day delay (Kelly et al. 
2016 2016). These are encouraging findings. 


9 Current status and results 


The HOLiCOW collaboration inferred 
from the independent analysis of six lensed quasar systems 


a Hubble constant value of Hy = 73.37}; km s-!Mpc"!, de- 
scribing deflector mass density profiles by either a power- 
law or stars (constant mass-to-light ratio) plus standard dark 


matter halos (Wong et al.||2020). This is a 2% precision on 


Ho, in excellent agreement with the local distance ladder 


measurement by the SHOES team 2019} [202 1) 


and more than 3o statistical tension with early-Universe probes 
(ex. 2020). 
The STRIDES collaboration presented an additional lens 
with the most precise single-lens measurement of Hp = 74.2%27 
km s~'Mpc™! with the same mass profile assumptions as the 
HOLiCOW collaboration (Shajib et al.|[2020). In sum, if the 
mass density profiles are well described by a power-law or a 
baryonic component with a constant mass-to-light ratio plus 
a NFW dark matter hald®] and under 
the assumption that the covariance are negligible, the tension 
is significant from the strong lensing measurements alone, 
corroborating other measurements, and new physics may be 
required. 


Given the important implications of the tension, the HOLiCOW, 


STRIDES, and SHARP collaborations, now TDCOSMO col- 
laboration, is undergoing a systematic investigation of possi- 
ble systematic effects. [Millon et al.|(2020b) found, combin- 
ing six lenses from HOLiCOW, SHARP and STRIDES, that 
the results when assuming that all lenses are either one or the 
other of the two previously assumed forms of the mass den- 
sity profile are valid. The good agreement in the Hp results 


© Imposing standard priors on the mass and concentration of the 
halo. 
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between power-law and composite profiles was interpreted 
by as a consequence of the ‘bulge- 
halo conspiracy’ that the combined baryonic and dark matter 
density components form a power-law profile (e.g., 
£2008). 
analyzed 8 quadruply imaged quasars with a 
free-form modeling approach and obtained Hp = 71S 
km s-!Mpc"!. investigated the effect 
of unaccounted for subhalos and small undetected line-of- 
sight halos in the uncertainty budget and found insignificant 
residual uncertainties to mitigate the tension of the measure- 
ments with the CMB and large scale structure probes. 
showed that a variety of expected 
azimuthal structures in the mass distribution (i.e. multipoles, 
twists and ellipticity gradients) should leave Ho unaffected 
at the population level unless there are specific selection ef- 
fect in the galaxy population. 


The attention thus turned to relaxing the radial profile 
assumption (see Section[5.2) and the covariant treatment of 
population priors that cannot be constrained on a lens-by- 
lens basis. addressed the issue in the 
most direct way, by choosing a parametrization of the radial 
mass density profile that is maximally degenerate with Ho, 
via the MST. With this more flexible parametrization, Ho 
is only constrained by the measured time delays and stel- 
lar kinematics, increasing the uncertainty on Ho from 2% 
to 8% for the TDCOSMO sample of 7 lenses resulting in 
Ho = 74.5*?° km s~'Mpe™', without changing the mean in- 
ferred value significantly. 


(2020) introduced a hierarchical framework 
(see also Section{7.2] but now for different datasets) in which 


external datasets can be combined with the time-delay lenses 
to improve the precision. They achieved a 5% precision mea- 
surement on Hp by combining the TDCOSMO lenses with 
stellar kinematic measurements of a sample of lenses from 
the Sloan Lens ACS (SLACS) survey with no time delay 


information (Bolton et al.| 2008 2009 
2021) and measure Hy = 67.43} km s-'Mpce™'. 


The mean of the TDCOSMO+SLACS measurement is off- 
set with respect to the TDCOSMO-only value, in the direc- 
tion of the CMB value, although still statistically consis- 
tent given the uncertainties. The mea- 
surements are in statistical agreement with each other and 
with the earlier HOLiCOW/ SHARP/ STRIDES measure- 
ments based on radial mass profile assumptions. {Birrer et al.| 
is also consistent, by construction, with the study by 


Shajib et al.|(2021), since they share the same measurements 
for SLACS. |Shajib et al.|(2021) concluded that NFW-+stars 


(using wider priors on mass and concentration than earlier 
HOLiCOW/ SHARP/ STRIDES measurements) is a suffi- 
ciently accurate description of the mass density profile of 
the SLACS lenses. However, small departures from those 
forms are allowed by the data, resulting in the uncertainties 


quoted by {Birrer et al.| (2020). The shift in the mean could 


be real or it could be due to an intrinsic difference between 
the deflectors in the TDCOSMO and SLACS samples, aris- 
ing from selection effects. For example: the two samples are 
well matched in stellar velocity dispersion, but they differ in 
redshift; the TDCOSMO sample is source selected and com- 
posed mostly of quadruply imaged quasars, while SLACS is 
deflector selected and dominated by doubly imaged galax- 
ies. 


10 Outlook in the (near) future 


The goal of time-delay cosmography is to provide a robust 
measurement of the Hubble constant to 1% precision to de- 
cisively tell the outcome of the currently observed tension 
between late and early time measurements of Ho. In the pre- 
vious Section 9 we presented current results. In this sec- 
tion, we discuss the potential of the time-delay method in 
the near future. First, we describe the data and instrumen- 
tation which enable us to push ahead (Section [I0.I). Sec- 
ond, we will highlight avenues where continuing work is 
required in assessing the methodology to maintain accuracy 
while increasing the precision of the measurements (Sec- 
tion|10.2). Finally, we leave some concluding remarks about 
the prospects of time-delay cosmography in Section [10.3] 


10.1 Future data 


On the full sky, we expect there to be several 10’000 galaxy- 
galaxy lenses and several hundred quadruply lensed quasars 
(eg. Oguri and Marshall 2010 (Colfet]2015}. With the up- 
coming wide and deep ground- and space-based surveys, we 
expect many of those to be discovered within a decade by the 
Vera Rubin Observatory, Roman Observatory, and Euclid. 
This is an e-folding of the number of lenses possibly suit- 
able for time-delay analyses compared to the current analy- 
ses conducted on few lenses (e.g., 7 lenses in case of current 
TDCOSMO results) and will transform the measurements 
and approaches in the domain of time-delay cosmography. 
The first step in utilizing these lenses is to discover them 
in large data sets. We refer for techniques, recent successes 
and an outlook in these regard to McMahon et al. (in prepa- 
ration). The next step is to acquire all the necessary follow- 
up information, from monitoring data for a time-delay mea- 
surements, high-resolution imaging, to spectroscopic infor- 
mation about the source and lens redshift as well as velocity 
dispersion of the deflector. This step is going to be challeng- 
ing with limited resources and there needs to be made deci- 
sions which lenses being excessively followed-up and which 
ones left aside. We comment in Section [i0.2|about develop- 
ments of methodology that can deal with less constraining 
or incomplete data for a larger lensing data set. Some lenses 
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might require less substantial follow-up in case where LSST 
light curves are good enough for a time-delay measurement 
(Liao et al.|{2015b), and or where high-resolution and suffi- 
ciently high signal-to-noise ratio data exists from wide field 
space surveys, such as Euclid or Roman. 


The key to assess the need for follow-up and on which 
lenses to spend it, is to what extent these data sets impact 
the precision on Ho. Follow-up decisions, besides the lim- 
ited resources, are currently also impacted by the accessi- 
ble to adaptive optics (AO) coverage. With next-generation 
AO instrumentation on both hemispheres, we expect a full 
sky coverage of instrumentation that allows the community, 
at least from a technical view point, to target every single 
gravitational lens on the sky. 


The dominant uncertainty in the current measurement of 
the Hubble constant with strong gravitational lensing time 
delays is attributed to uncertainties in the mass profiles of 
the main deflector galaxies. There are several independent 
avenues of data available in the near future to approach a 
1% measurement of Ho that we focus in this section. 


Spatially resolved kinematics of the deflector galaxy (see 
Section [5.3] for details on methodology) with the next gen- 
eration space (James Webb Space Telescope; JWST) and 
ground-based (extremely large telescopes) instruments pro- 
vides precise measurement of the kinematics and have the 
ability to break the mass-anisotropy degeneracy, a currently 
limiting systematic when using integrated kinematic mea- 
surements. (2021) forecasts that with 40 
time-delay lenses with exquisite spatially resolved kinemat- 
ics, a 1.5% precision on Ho can be achieved without relying 
on mass-density profile assumptions to break the MST (Fig- 
ure[IT]left graphic), and see also e.g. [Yildirim et al.| (2020) 
(2021). Such a strategy with exquisite data on the sample of 
time-delay lenses is one way to make progress. Another ap- 
proach is to infer the mass density profile properties from a 
larger set of non-time-delay lenses and apply the constraints 
on the mass density profile and stellar anisotropy distribu- 


tion on the time-delay lenses 2020} 
2021 2022). In particular, resolved spec- 


troscopy can also be employed on non-time delay lenses 
without bright and contaminating quasar images, either as 
prior constraints or by directly incorporating into a hierar- 
chical analysis, to further improve the kinematic measure- 
ment precision. 


Standardizable magnifications with gravitationally lensed 
supernovae (glSNe) provide another promising avenue to 
constrain the MST in the near future with the onset of LSST. 
(Figure[Ii]right panel) provides a fore- 
cast with glSNe in constraining Hy independently of stellar 
kinematics. They conclude that the standardizable nature en- 
ables a 1.5% Ho measurement with a 10 years LSST survey. 
On the discovery, expected number of glSNe, the challenges 


of following them up and the caveats of micro-lensing we 
refer to Suyu et al. (in preparation). 

Another method is to make use of the statistical distribu- 
tion of images under the assumption of knowing the distri- 
bution of sources in the source plane with a statistical combi- 
nation of a large sample of time-delay lenses, relying purely 


on strong lensing data (Sonnenfeld) |202 1b). 


10.2 Methodology improvements 


With the expected wealth of data and the increase in the 
number of time-delay and non-time-delay lenses, the prospect 
of measuring Ho to 1% precision can become a reality. The 
employed methodology and assumptions must keep up to 
provide the accuracy requirement. In the following we dis- 
cuss methodology improvements and validations in the do- 


main of galaxy density profiles (Section |10.2.1), assump- 
tion in the interpretation of non-lensing constraints (Sec- 


tion [10.2.2), selection effects (Section [10.2.3), automatiza- 
tion (Section [10.2.4) and general aspects of methodology 
verification (Section [10.2.5). These sections are not meant 
to be complete but to provide guidance in the near future on 
where focused effort is required. 


10.2.1 Galaxy density profiles 


Mass profile assumptions and translating constraints from 
one profile form to another leaving the predicted observ- 
ables invariant. The currently employed models mitigating 
the MST effect is parameterized with a pure MST parame- 


ter 2 2020). This parameterization is purely of 


mathematical nature and leaves the physical interpretation 
(e.g., ambiguous, or, in certain regimes 
even unphysical with e.g. mass profiles with negative den- 
sity in the outskirts. Such a one-parameter extension to pre- 
viously considered more simple and rigid mass profiles may 
also not encompass the necessary flexibility beyond the pure 
MST that can affect kinematics observations (e.g., 
(2021). To make progress, the 
full degeneracy of the MST needs to be folded into flexi- 
ble, but physically motivated, mass profile parameters, an 
approach explored by (2021), but not yet em- 
ployed for time-delay cosmography. Quasar microlensing 
studies might also help to constrain the stellar mass to light 
ratio in massive elliptical galaxies. Ambitious measurements 
below the 10% level might additionally help to constrain the 
mass density profiles and would allow the focus on the dark 
matter portion of the profile. We refer to Vernardos et al. (in 
prepration) for techniques and prospects of this methodol- 


ogy. 
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Fig. 11 Forecast for Hp measurements in the near future with the upcoming ground- and space-based facilities. Left: Spatially resolved kinematics 


measurements of a sample of 40 time-delay lenses enable a precision on Ho of 1.5% (Figure adopted from (2021)). Right: 
Standardizable magnification measurements of ~ 144 gravitationally lensed supernovae enable a precision on Ho of 1.5% (Figure adopted from 


2021)). Both approaches do constrain the MST with independent observations. 


10.2.2 Non-lensing constraints 


Significant constraints on the MST, and in general mass den- 
sity profiles, are expected to come from non-lensing observ- 
ables. These measurements, as well as their model interpre- 
tation, need to be tested to the percent level. For example, for 
the kinematic measurements, the impact of stellar template 
fitting needs to be further assessed and validated. For the 
interpretation of the measurements, de-projection assump- 
tions, rotational structure, as well as stellar anisotropy need 
to be rigorously tested and assessed for covariant system- 
atics on the population level. For upcoming magnification 
measurements with glSNe, micro- and milli-lensing effects 
need to be assessed and incorporated into the model self- 
consistently. Furthermore, more knowledge about the struc- 
ture and size of the variable quasar accretion disc are re- 
quired to determine the strength of the micro-lensing time 
delay effect. 


10.2.3 Selection effects 


Strong lenses are inherently tracing a narrow and rare dis- 
tribution of matter in the universe. Quantifying the selection 
effects, including the differential selection effects among dif- 
ferent samples of lenses is going to be crucial to maintain 
accuracy in the years to come. Selection effects can impact 
the line-of-sight distribution, the main deflector mass den- 
sity and ellipticity, the galaxy properties of the deflector as 
well as the source, and projection effects. Many of these ef- 
fects can not precisely be quantified on a lens-by-lens basis. 

For example, quadruply lensed quasars are visible only 
when the source quasar lies within the diamond caustic of 


the lensing galaxy. This condition creates a Malmquist-like 
selection effect in the population of observed quadruply lensed 


quasars, increasing the true caustic area (Baldwin and Schechter 


2021).|Collett and Cunnington|(2016) simulated a sample of 


double- and quadruple-image systems and when assuming 
reasonable thresholds on image separation and flux, based 
on current lens monitoring campaigns, they found that the 
typical density profile slopes of monitorable lenses are sig- 
nificantly shallower than the input ensemble. 


There are two approaches to mitigate selection effects. 
First, one can try to understanding selection from first prin- 
ciple and explicitly account for the theoretical selection func- 
tion in the analysis procedure. This approach requires ex- 
tensive simulations and a reproducible selection function in- 
cluding the discovery channel and follow-up decision. Sec- 
ond, one can empirically measure selection functions from a 
set of observables at hand with assumptions of self-similarity 
among galaxies and line of sights with identical properties, 
such as stellar mass, morphology, redshift and environment, 
and explore empirical scaling relation among them. We refer 
to Section|6]for data and approaches to quantify line of sight 
effects. We also stress that these techniques rely on underly- 
ing priors on the population and an explicit de-biasing is re- 
quired to constrain hierarchically unknown selection effects 
(see €.g., [Park). Currently, neither of the two approaches 
have been successfully applied. 


With the anticipated large number of lenses in the near 
future, and the more uniform data set of large and deep sur- 
veys, both approach will become feasible and we advocate 
analyses that take into account the specific discovery chan- 
nel in the analysis. 


24 


Birrer et al. 


10.2.4 Automatization 


Current state-of-the-art analyses of single lens systems takes 
up more than a year of work, with the involvement of many 
people, as well as several hundred of thousands of CPU 
hours of computational cost. To utilize the upcoming larger 
lens samples and to achieve a high-precision Hp measure- 
ment, the time to analyse a single system has to be reduced 
significantly. Automated decision-making and model choices 


(e.g.,/Schmidt et al.||2022 2022), as well as GPU 
assisted computations (e.g., 2022) hold promises 


in these regards. Moreover, analyses have to be able to be re- 
peated with modifications to test for assumptions covariant 
among all lenses multiple times. The faster the entire analy- 
sis runs, the more explorations of potential systematics in the 
choices can be executed. The challenge in finding uniform 
analyses choices are that every lens is different from another 
and particularities have been notices that needed special at- 
tention for lenses on the individual basis. The analyses con- 
ducted need to be uniform in their choices and approaches 
such that impacts on assumptions can be tested on the en- 
semble level. Uniformity of analyses can also reduce human 
errors and sets the analyses on quantifiable priors. 

There is currently an effort in homogenizing the analysis 


procedure, for both time-delay lenses (Shajib et al.||2019 
Schmidt et al.| |2022} 2022) and non-time delay 
lenses (Shajib et al.|/2021) and further effort is underway. In 


parallel, alternative methodology in the modeling and pos- 
terior inference are being explored with machine learning 
techniques, which have the potential to speed up the analy- 


sis by orders of magnitude (e.g., 2021) 
10.2.5 Methodology verification 


Guaranteeing accuracy with ever more precise measurements 
is a challenge throughout the cosmological community. High- 
precision measurements of quantities to relevance of funda- 
mental physics is a relatively new field and we dedicate a 
separate subsection highlighting different strategies to ver- 
ify the methodology and to perform to the necessary quality 
standard to maintain accuracy. 


— Realistic simulations offer a validation of a methodol- 


ogy on a known truth (see e.g., [Xu et al.] 2016 
2018). It is important that the complexity in the 


simulations are realistic to explore avenues of potential 
systematics and gain a deep understanding of what data 
products are able to constrain what aspects of the model. 
Simulations eventually need to encompass all aspects of 
the analysis, including the selection effect and the entire 
line-of-sight structure within the full cosmological and 
astrophysical context. 

— Data modeling challenges, such as the Time-Delay Chal- 


lenge 2015a) and the Time-Delay Lens Mod- 


eling Challenge (Ding et al.| |2021) offer platforms to 


validate currently employed methodology on mock data 
sets, explore new ways of analyzing the data and can 
provide a transparent overview of the current state of the 
field. 

— Blind analyses prevent experimenter bias. The analysis 
should be guided by the assessment of uncertainties re- 
gardless of the anticipated result. Blind analyses have 
regularly been performed by the HOLiCOW and TD- 
COSMO collaborations. 

— Open source accessibility of the raw data, processed data 
product, analysis software and entire end-to-end analysis 
pipelines can best guarantee reproducibility, form com- 
munity trust and provides access to the community to 
alter and improve existing methodology. 


10.3 Concluding remarks 


Time-delay cosmography has an exciting time ahead. The 
method has come along way since its original proposal by 
(1964). Current measurements of the Hubble con- 
stant with time-delay cosmography are at the few percent 
level, enabled by detailed analyses and precise measurements 
of different aspects of the analysis. With the expected in- 
crease in the lensing sample and the advances in instrumen- 
tation, the path towards a percent precision measurement of 
Ho becomes in reach. 

Measuring the Hubble constant to percent level preci- 
sion is a challenging endeavor, regardless of the cosmolog- 
ical probe. In this manuscript, we aimed to provide a de- 
tailed account of the methodology and measurements to pro- 
vide guidance to achieve a precise and accurate measure- 
ment of Ho at the one-percent level. We emphasized the 
challenges and systematics in the different components of 
the analysis and strategies to mitigate them. Above all, in 
Carl Sagans words: "Extraordinary claims require extraor- 
dinary evidence”. 
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